Author: Kaitlin Volk; with edits by Jeff Cegan
Date: May 19, 2021
Project HAB with Jodi Ryder

Load and filter data.

Summary of data:

##       loc_id        reservoir                     sample_num     
##  NRR20001: 60587   Length:1070160     200908190900000.0:     57  
##  GRR20001: 58858   Class :character   201508171200000.0:     56  
##  TAR20001: 57385   Mode  :character   199208181200000.0:     52  
##  EFR20001: 53021                      201409231200000.0:     51  
##  CRR20001: 47639                      199206161200000.0:     48  
##  CCK20001: 45230                      199207281200000.0:     48  
##  (Other) :747440                      (Other)          :1069848  
##   sample_date              yr                 mo                depth       
##  Min.   :1965-04-18   Length:1070160     Length:1070160     Min.   :  0.00  
##  1st Qu.:1988-09-13   Class :character   Class :character   1st Qu.: 10.00  
##  Median :1997-07-29   Mode  :character   Mode  :character   Median : 22.00  
##  Mean   :1996-12-28                                         Mean   : 26.73  
##  3rd Qu.:2007-07-31                                         3rd Qu.: 40.00  
##  Max.   :2019-12-12                                         Max.   :999.00  
##                                                                             
##      param            value              units       
##  temp_c :461627   Min.   :  -99.00   deg C  :450676  
##  DO_mgl :419019   1st Qu.:    3.95   mg/l   :421756  
##  DO_perc: 67556   Median :   11.00   mg/L   : 82011  
##  TDS    : 29568   Mean   :   20.31   %      : 63754  
##  P_total: 16284   3rd Qu.:   21.10   ug/l   : 18289  
##  C_total: 16237   Max.   :40300.00   Deg C  : 10950  
##  (Other): 59869                      (Other): 22724
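
The summary above shows the same unit spelled several ways (`mg/l` and `mg/L`, `deg C` and `Deg C`) and extreme values (a minimum `value` of -99, a maximum `depth` of 999) that look like placeholder codes. The following is a minimal base-R sketch of one way to tidy these before plotting. The helper names `normalize_units` and `flag_sentinels` are hypothetical, the toy rows are made up, and treating -99 and 999 as missing-data codes is an assumption that should be confirmed against the data dictionary.

```r
# Hypothetical helpers; column names follow the summary output above.
normalize_units <- function(units) {
  u <- trimws(as.character(units))
  key <- tolower(u)
  # Map case variants onto one canonical spelling; unknown units pass through.
  canonical <- c("mg/l" = "mg/L", "ug/l" = "ug/L", "deg c" = "deg C", "%" = "%")
  hit <- key %in% names(canonical)
  u[hit] <- unname(canonical[key[hit]])
  u
}

# ASSUMPTION: -99 (value) and 999 (depth) are missing-data codes, not measurements.
flag_sentinels <- function(df, value_code = -99, depth_code = 999) {
  df$value[!is.na(df$value) & df$value == value_code] <- NA
  df$depth[!is.na(df$depth) & df$depth == depth_code] <- NA
  df
}

# Made-up rows for illustration only.
demo <- data.frame(
  param = c("DO_mgl", "DO_mgl", "temp_c", "temp_c"),
  value = c(8.2, -99, 21.5, 19.0),
  depth = c(0, 10, 999, 5),
  units = c("mg/l", "mg/L", "Deg C", "deg C"),
  stringsAsFactors = FALSE
)
demo$units <- normalize_units(demo$units)
demo <- flag_sentinels(demo)
print(demo)
```

Setting suspect codes to `NA` rather than dropping the rows keeps the sample counts comparable with the summary.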

Exploring ChlA data with Plots

ChlA is an important variable for identifying HABs and for the correlations in 1.6 of the SOW.
Two methods have been used to measure it: trichromatic and spectrophotometric. Is one better? Should both be used?
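
One way to approach the question of whether the two methods agree is to compare them on samples where both were measured. The sketch below is only an illustration: the vectors `chla_tri` and `chla_spec` are hypothetical names, the values are made up, and `compare_methods` is not a function from the project code.

```r
# Hypothetical helper: summarize agreement between two measurement methods.
compare_methods <- function(x, y) {
  ok <- is.finite(x) & is.finite(y)   # keep samples measured by both methods
  d <- x[ok] - y[ok]
  list(
    n_pairs   = sum(ok),
    spearman  = if (sum(ok) > 2) cor(x[ok], y[ok], method = "spearman") else NA_real_,
    mean_diff = if (sum(ok) > 0) mean(d) else NA_real_
  )
}

# Made-up values for illustration only (ug/L).
chla_tri  <- c(2.1, 5.4, NA, 12.0, 7.3)
chla_spec <- c(1.9, 5.0, 3.2, 11.1, NA)
res <- compare_methods(chla_tri, chla_spec)
str(res)
```

A high rank correlation with a small mean difference would suggest the methods can be pooled; a consistent offset would argue for keeping them separate or applying a correction.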

### Look for outliers and other anomalies in the variables

## Warning: Removed 11930 rows containing non-finite values (stat_qq).

## Warning: Removed 54539 rows containing non-finite values (stat_qq).

## Warning: Removed 472111 rows containing non-finite values (stat_qq).

## Warning: Removed 406000 rows containing non-finite values (stat_qq).

## Warning: Removed 461332 rows containing non-finite values (stat_qq).

## Warning: Removed 457343 rows containing non-finite values (stat_qq).

## Warning: Removed 457302 rows containing non-finite values (stat_qq).

### Dive into total phosphorus (P_total)

### Clean the data

### Visualize the cleaned data

## Warning: Removed 11900 rows containing non-finite values (stat_qq).

## Warning: Removed 85802 rows containing non-finite values (stat_qq).

## Warning: Removed 471962 rows containing non-finite values (stat_qq).

## Warning: Removed 405852 rows containing non-finite values (stat_qq).

## Warning: Removed 461200 rows containing non-finite values (stat_qq).

## Warning: Removed 457211 rows containing non-finite values (stat_qq).

## Warning: Removed 457441 rows containing non-finite values (stat_qq).

## How many reservoirs, and how many sites per reservoir, are there?

## # A tibble: 51 x 2
##    reservoir n.sites
##    <chr>       <int>
##  1 BBL             5
##  2 BGR             1
##  3 BHR            30
##  4 BL1             1
##  5 BNV             4
##  6 BPR             3
##  7 BRR            78
##  8 BVR            28
##  9 BWR             2
## 10 CBR            15
## # … with 41 more rows
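
A count like the one above can be reproduced by reducing the data to distinct reservoir/site pairs and tallying them. The following base-R sketch assumes the `reservoir` and `loc_id` columns shown in the data summary; `count_sites` is a hypothetical helper and the toy site IDs are made up.

```r
# Hypothetical helper: number of distinct sites (loc_id) per reservoir.
count_sites <- function(df) {
  pairs <- unique(df[, c("reservoir", "loc_id")])  # one row per reservoir/site pair
  counts <- table(pairs$reservoir)                 # tally sites within each reservoir
  data.frame(
    reservoir = names(counts),
    n.sites = as.integer(counts),
    stringsAsFactors = FALSE
  )
}

# Made-up rows: repeated samples at the same site must be counted once.
toy <- data.frame(
  reservoir = c("BBL", "BBL", "BBL", "BGR", "BGR"),
  loc_id = c("BBL20001", "BBL20001", "BBL20002", "BGR20001", "BGR20001"),
  stringsAsFactors = FALSE
)
site_counts <- count_sites(toy)
print(site_counts)
```

With dplyr, the same tally is usually written by grouping on `reservoir` and calling `n_distinct(loc_id)` inside `summarise()`.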

1.2 Time series plots for a single location

## `geom_smooth()` using formula 'y ~ x'

## `geom_smooth()` using formula 'y ~ x'
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : pseudoinverse used at 18102
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : neighborhood radius 4.02
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : reciprocal condition number 0
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : at 18106
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : radius 0.0004
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : all data on boundary of neighborhood. make span bigger
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : There are other near singularities as well. 0.0004
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : zero-width neighborhood. make span bigger
## Warning: Computation failed in `stat_smooth()`:
## NA/NaN/Inf in foreign function call (arg 5)

## `geom_smooth()` using formula 'y ~ x'
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : span too small. fewer data values than degrees of freedom.
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : at 18102
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : radius 0.0004
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : all data on boundary of neighborhood. make span bigger
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : pseudoinverse used at 18102
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : neighborhood radius 0.02
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : reciprocal condition number 1
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : There are other near singularities as well. 16.16
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : zero-width neighborhood. make span bigger
## Warning: Computation failed in `stat_smooth()`:
## NA/NaN/Inf in foreign function call (arg 5)

## `summarise()` has grouped output by 'loc_id'. You can override using the `.groups` argument.
## Warning in `[<-.factor`(`*tmp*`, list, value = 0): invalid factor level, NA
## generated
## `summarise()` has grouped output by 'reservoir'. You can override using the `.groups` argument.
## Warning: All formats failed to parse. No formats found.
## `geom_smooth()` using formula 'y ~ x'
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : span too small. fewer data values than degrees of freedom.
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : at 18102
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : radius 0.0004
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : all data on boundary of neighborhood. make span bigger
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : pseudoinverse used at 18102
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : neighborhood radius 0.02
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : reciprocal condition number 1
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : There are other near singularities as well. 16.16
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : zero-width neighborhood. make span bigger
## Warning: Computation failed in `stat_smooth()`:
## NA/NaN/Inf in foreign function call (arg 5)
## Warning: Removed 385 rows containing missing values (geom_text).
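
The `simpleLoess` warnings above ("span too small", "zero-width neighborhood") typically appear when a series has too few distinct dates for a local regression. One possible guard, sketched below, is to choose the smoother from the number of distinct x values before plotting. `pick_smoother` is a hypothetical helper, and the cutoff of 10 points is an arbitrary assumption, not a value taken from the project.

```r
# Hypothetical helper: choose a smoothing method from the number of distinct x values.
pick_smoother <- function(x, min_loess = 10) {
  n <- length(unique(x[is.finite(x)]))
  if (n < 3) {
    NA_character_   # too few distinct points: skip the smoother entirely
  } else if (n < min_loess) {
    "lm"            # straight-line fit is stable with few points
  } else {
    "loess"
  }
}

# Made-up date vectors, converted to numeric as ggplot2 does internally.
dates_sparse <- as.numeric(as.Date(c("2019-07-25", "2019-07-25", "2019-07-29")))
dates_dense  <- as.numeric(seq(as.Date("2019-05-01"), by = "week", length.out = 20))
print(pick_smoother(dates_sparse))
print(pick_smoother(dates_dense))
```

The returned string could be passed as the `method` argument of `geom_smooth()`, with the layer omitted when the result is `NA`.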

## `mutate_all()` ignored the following grouping variables:
## Column `reservoir`
## Use `mutate_at(df, vars(-group_cols()), myoperation)` to silence the message.

1.3 Characterizing reservoirs based on the last five years of summer months (May-Sept)

| Parameter   | Oligo | Meso | Eutro |
|-------------|-------|------|-------|
| TP (ug/L)   | 8     | 27   | 84    |
| TN (mg/L)   | 0.66  | 0.75 | 1.9   |
| ChlA (ug/L) | 1.7   | 4.7  | 14    |
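
The table values can be turned into a simple classifier once a summer mean is available for each reservoir. The sketch below is one possible reading, not the project's method: it assumes the table entries are reference values (the source does not say whether they are means or boundaries), assigns the nearest one on a log scale, and takes "last five years" relative to the latest year in the data. `nearest_state` and `summer_mean` are hypothetical helpers and the toy data are made up.

```r
# Reference values copied from the table above.
trophic_ref <- data.frame(
  state = c("Oligo", "Meso", "Eutro"),
  TP    = c(8, 27, 84),        # ug/L
  TN    = c(0.66, 0.75, 1.9),  # mg/L
  ChlA  = c(1.7, 4.7, 14),     # ug/L
  stringsAsFactors = FALSE
)

# ASSUMPTION: classify by the nearest reference value on a log scale.
nearest_state <- function(x, param) {
  if (!is.finite(x) || x <= 0) return(NA_character_)
  ref <- trophic_ref[[param]]
  trophic_ref$state[which.min(abs(log(x) - log(ref)))]
}

# Mean of May-Sept values over the last `n_years` years present in the data.
summer_mean <- function(dates, values, n_years = 5) {
  yr <- as.integer(format(dates, "%Y"))
  mo <- as.integer(format(dates, "%m"))
  keep <- mo >= 5 & mo <= 9 & yr > max(yr) - n_years & is.finite(values)
  if (!any(keep)) return(NA_real_)
  mean(values[keep])
}

# Made-up total phosphorus samples (ug/L) for one reservoir.
dates <- as.Date(c("2012-07-01", "2018-06-15", "2018-12-01", "2019-08-20"))
tp    <- c(200, 20, 500, 30)
m <- summer_mean(dates, tp)
print(nearest_state(m, "TP"))
```

A reservoir may land in different classes for TP, TN and ChlA, so it is worth reporting all three rather than a single label.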

1.4 Assign reservoirs to mixing regime

```{r, echo=FALSE}
```

1.5 Apply Mann-Kendall test to time series to find increasing trends

```{r, echo=FALSE}
```
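
For reference, the Mann-Kendall test can be written in a few lines of base R. This is a minimal sketch, not the project's implementation: `mann_kendall` is a hypothetical function, the series is made up, and the test as written assumes serially independent observations with no seasonal adjustment. It uses the standard S statistic, the tie-corrected variance, and a continuity-corrected normal approximation.

```r
# Hypothetical helper: Mann-Kendall trend test with tie correction.
mann_kendall <- function(x) {
  x <- x[is.finite(x)]
  n <- length(x)
  if (n < 3) stop("need at least 3 observations")
  # S = sum of sign(x_j - x_i) over all pairs i < j
  d <- outer(x, x, function(a, b) sign(b - a))  # d[i, j] = sign(x[j] - x[i])
  s <- sum(d[upper.tri(d)])
  # Variance of S, corrected for groups of tied values
  ties <- as.numeric(table(x))
  var_s <- (n * (n - 1) * (2 * n + 5) -
              sum(ties * (ties - 1) * (2 * ties + 5))) / 18
  # Continuity-corrected normal approximation
  z <- if (s > 0) (s - 1) / sqrt(var_s) else if (s < 0) (s + 1) / sqrt(var_s) else 0
  list(S = s, var_S = var_s, Z = z,
       p_increasing = pnorm(z, lower.tail = FALSE))  # one-sided: increasing trend
}

# Made-up annual summer means (ug/L), ordered by year.
chla <- c(3.1, 2.8, 3.5, 4.0, 3.9, 4.6, 5.2, 5.0)
mk <- mann_kendall(chla)
str(mk)
```

A small one-sided p-value points to an increasing trend. For production use, a tested implementation such as `MannKendall()` from the Kendall package is likely a better choice than this sketch.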

1.6 Cross-correlation of time series (esp. ChlA, our bloom indicator)

```{r, echo=FALSE}

```
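
As an illustration of the cross-correlation step, base R's `ccf()` can locate the lag at which a driver series is most strongly related to ChlA. The sketch below uses synthetic data in which ChlA follows a driver by two time steps. It assumes both series have already been aggregated to a regular time step (for example monthly means), which the raw samples are not. The variable names are hypothetical.

```r
set.seed(1)
n <- 60
lag_true <- 2
driver_full <- rnorm(n)                       # synthetic driver, e.g. a nutrient series

# Align the two series so that chla at time t depends on the driver at time t - 2.
tp   <- driver_full[(lag_true + 1):n]
chla <- 0.8 * driver_full[1:(n - lag_true)] + rnorm(n - lag_true, sd = 0.3)

# ccf(x, y) at lag k estimates the correlation between x[t + k] and y[t].
cc <- ccf(tp, chla, lag.max = 6, plot = FALSE)
best_lag <- as.numeric(cc$lag[which.max(abs(cc$acf))])
print(best_lag)
```

Under R's convention, a negative best lag for `ccf(tp, chla)` means the first series leads the second, so here the driver precedes the ChlA response.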